Fast protocols for local implementation of bipartite nonlocal unitaries 
In certain cases the communication time required to deterministically implement a nonlocal bipartite
unitary using prior entanglement and LOCC (local operations and classical communication)
can be reduced by a factor of two. We introduce two such "fast" protocols and illustrate them with 
various examples. For some simple unitaries, the entanglement resource is used quite efficiently. The 
problem of exactly which unitaries can be implemented by these two protocols remains unsolved, 
though there is some evidence that the set of implementable unitaries may expand at the cost of 
using more entanglement. 
I. INTRODUCTION

A central problem in quantum information theory is 
that of interconversion between resources, for example, 
how to use communication channels to produce entangle- 
ment between distant parties, or how to use such entan- 
glement to carry out nonlocal operations. In particular, 
the use of prior entanglement assisted by classical communication
to carry out nonlocal unitaries has been the
subject of various studies [1-3]; for a more extensive list
see Ref. [3].

In this paper we add time as a resource to be con- 
sidered along with entanglement cost when construct- 
ing protocols for bipartite nonlocal unitaries (nonlocal 
gates). The ability to implement nonlocal unitaries 
rapidly may be particularly relevant in the context of 
distributed quantum computation [4-7], where less time
consumption means less decoherence, or in position-based
quantum cryptography [8-11], where it may allow
certain position verification schemes to be broken.

The usual protocols for bipartite unitaries, such as
those in Ref. [3], have the following general structure:
Alice carries out local operations and measurements, and 
sends the measurement results through a classical com- 
munication channel to Bob, who then carries out corre- 
sponding operations and measurements, and sends the 
measurement results back to Alice using classical com- 
munication. Finally, Alice performs additional local op- 
erations that may depend on the previous measurement 
results of both parties. When the distance between the 
two parties is large the total time required for the proto- 
col will be dominated by the two rounds of communica- 
tion, thus double the minimum time for a signal to pass 
from one to the other. However, there exist nonlocal uni- 
taries which can be implemented by a protocol in which 
Alice and Bob carry out local operations and measure- 
ments at the same time, and then simultaneously send 
the results to the other party, and finally perform local 
operations depending on the received messages to complete
the protocol. This reduces the total communication
time by a factor of two.^1 We are interested in identifying
which bipartite unitaries can be carried out using such
a fast protocol, and also in finding the associated entanglement
cost. The crucial distinction between a fast and
slow protocol of the form considered here is that for the
latter, Bob needs to wait for a message from Alice before
choosing the basis in which to carry out his measurement
("choosing the measurement basis" is equivalent to
choosing what local gates to do before his measurement),
whereas in the former this basis can be fixed in advance.

We have identified two classes of nonlocal unitary that
lend themselves to a fast protocol: controlled unitaries of
the form shown in (1) below, and group unitaries of the
form shown in (9). The slow versions of both were considered
in our previous work [3], where we showed that
controlled unitary protocols, while useful for understanding
how such protocols work, can always be replaced by
group unitary protocols that make use of the same resources.
Our fast protocols represent special cases (i.e.,
special groups and parameter choices) of the slow protocols
discussed previously, and once again the controlled
kind can be replaced by the group kind. By increasing
the amount of entanglement expended, additional unitaries
can be carried out using these fast protocols. In
some cases this allows an arbitrarily close approximation
to a unitary which cannot be carried out exactly by these
methods. A still more general class of slow protocols, corresponding
to Eq. (18) in [3], also has a fast version, but
we have yet to find examples of unitaries it can carry out
that cannot be implemented by our other fast protocols.

The protocols we consider are deterministic (they succeed
with probability one) and use a definite amount of
entanglement determined in advance. Such deterministic
fast protocols have previously been studied by Groisman
and Reznik [12] for a controlled-NOT (CNOT) gate on
two qubits, and by Dang and Fan [13] for its counterpart on
two qudits. In addition, Buhrman et al. [11] and Beigi
and König [14] have published approximate schemes for
what they call "instantaneous quantum computation,"
equivalent to a fast bipartite unitary in our language.
These protocols, unlike ours, can be used to approximately
carry out any bipartite unitary. The one in [11],
which is based on the nonlocal measurement protocol
in [15], has a probability of success less than 1, so it
is not deterministic, but this probability can be made arbitrarily
close to 1 by using sufficient entanglement. The
protocol in [14] uses a fixed amount of entanglement to
implement with probability 1 a bipartite quantum operation
(completely positive trace preserving map) which
is close to the desired unitary, and it can be made arbitrarily
close by using sufficient entanglement. The term
"instantaneous" is not unrelated to the idea of an "instantaneous
measurement" as discussed in [15]-[17], where
the terminology seems somewhat misleading in that completing
their protocols actually requires a finite communication
time, e.g., the parties must send the results to
headquarters (or to each other) in order to complete the
identification of the measured state. In the same way,
"instantaneous quantum computation" actually requires
a finite communication time, the same as in our fast protocol.

^1 There may be in practice other temporal costs that need to be
taken into account, such as that required to produce the initial
entangled state. We are ignoring these in the present paper.

The paper is organized as follows. In Sec. II we consider
controlled unitaries of the form Eq. (1), where the
unitaries being controlled form an Abelian group. (Appendix A
contains an argument, which may be of more
general interest, that allows projectors $P_k$ in this formula
to be replaced by projectors of rank 1.) In addition
we show how subsets of the collection of unitaries
representing an Abelian group can be employed to generate
fast unitaries otherwise not accessible by our protocol.
Section III is devoted to group unitaries of the
form (9), including a significant number of examples. We
also present an argument showing that the controlled-Abelian-group
unitaries of Sec. II can be transformed to
group unitaries of the form (9). The concluding Sec. IV
contains a brief summary along with an indication of
some open problems.



II. FAST PROTOCOL FOR
CONTROLLED-ABELIAN-GROUP UNITARIES

In this section we construct a fast protocol for any
controlled-Abelian-group unitary of the form

$$\mathcal{U} = \sum_{k=0}^{N-1} P_k \otimes V_k, \tag{1}$$

where the $P_k$ are orthogonal projectors, possibly of rank
greater than 1, on a Hilbert space $\mathcal{H}_A$ of dimension $d_A$.
The $\{V_k\}$ are unitary operators on a Hilbert space $\mathcal{H}_B$ of
dimension $d_B$ that form a representation of an Abelian
group $G$ of order $N$. As shown in Appendix A 2, it suffices
to consider projectors of rank 1. That is, a scheme for
implementing

$$\mathcal{U} = \sum_{k=0}^{N-1} |k\rangle\langle k| \otimes V_k, \tag{2}$$

where $|k\rangle$ denotes a ket belonging to a standard (or computational)
orthonormal basis, is easily extended to one
that carries out the more general (1). In addition we shall
consider cases, Sec. II C, in which the $\{V_k\}$ form a subset
of an Abelian group, with the sum in (1) restricted to a
subset $S$ of the integers from $0$ to $N-1$.



A. Fast protocol for controlled-cyclic-group
unitaries

The simplest Abelian group is a cyclic group, so we
start with the case where the $\{V_k\}$ in (2) are a representation
of such a group. (It suffices to consider ordinary
representations, since a projective representation of
a cyclic group is equivalent to an ordinary representation;
see Sec. 12.2.4 of [18].) It will be convenient to let $V_0$ be
the identity and $V_k = V_1^k$. The slow protocol [3] for this
case, which works for any collection $\{V_k\}$ of unitaries on
$B$, is shown in Fig. 1, where

$$|\Phi\rangle = \frac{1}{\sqrt{N}} \sum_{j=0}^{N-1} |j\rangle_a \otimes |j\rangle_b \tag{3}$$

is a fully entangled state on the ancillary systems $a$ and
$b$ associated with $A$ and $B$, respectively, and the gates
$X$, $Z$, and $F$ (the Fourier matrix) are defined by

$$X = \sum_{j=0}^{N-1} |j \ominus 1\rangle\langle j|, \qquad Z = \sum_{k=0}^{N-1} e^{2\pi i k/N} |k\rangle\langle k|, \qquad F = \frac{1}{\sqrt{N}} \sum_{m,k=0}^{N-1} e^{2\pi i m k/N} |m\rangle\langle k|. \tag{4}$$

Here $j \ominus l$ denotes $(j - l) \bmod N$, so $X^l |j\rangle = |j \ominus l\rangle$.
The symbols resembling "D" in Fig. 1 represent measurements
in the standard basis.

The slow protocol proceeds by Alice carrying out the
operations indicated on the left side of Fig. 1 and then
sending the outcome $l$ of the measurement on $a$ to Bob
over a classical channel. He uses it to carry out a gate
$X^l$, followed by the other operations in the center of the
figure. His final measurement outcome $m$ is sent to Alice
over another classical channel, who uses it to perform an
additional gate $Z_m = Z^{-m}$ that completes the protocol.

FIG. 1. The slow protocol in Sec. III of [3] for implementing the unitary $\mathcal{U} = \sum_{k=0}^{N-1} |k\rangle\langle k| \otimes V_k$.

FIG. 2. The fast protocol for implementing the unitary $\mathcal{U} = \sum_{k=0}^{N-1} |k\rangle\langle k| \otimes V_k$.

A faster protocol can be constructed if the two rounds
of classical communication can be carried out simultaneously
instead of consecutively. This is possible if Bob can
carry out various operations, including a measurement,
in advance of receiving the value of $l$ from Alice, as in
Fig. 2. The classical signals can then be sent simultaneously,
and the protocol is completed when both Alice
and Bob make final corrections that depend on the signals
they receive. In order to change the slow protocol
into a fast protocol, one must, in effect, push the $X^l$ gate
in Fig. 1 through the two gates that follow it in order to
arrive at the situation in Fig. 2. The two steps are as
follows:



1. Commute $X^l$ with the controlled-$V_k$ gate: the $X^l$
itself passes through the control node unchanged,
but leaves the $V$ gate controlled by state $|j\rangle$ instead
of $|k\rangle$, so $V_j$ instead of $V_k$ acts on $B$. This can be
compensated at the end of the protocol by a local
unitary correction of $V_l^\dagger$, with $l = j \ominus k$ (see the
discussion below).

2. Commute $X^l$ with $F$: we have that $F X^l = Z^{-l} F$,
and since the $Z^{-l}$ are diagonal unitaries, they do
not affect the measurement result $m$ in the standard
basis, and thus $Z^{-l}$ is absent from the fast
protocol in Fig. 2. (Due to the removal of $Z^{-l}$ there
is an unimportant global phase, dependent on $l$ and
$m$, that is introduced in the implementation of $\mathcal{U}$.)



The final correction $V_l^\dagger$ is possible because the $\{V_k\}$
form a cyclic group: The net operation on $B$ is $V_l^\dagger V_j =
V_{j \ominus l} = V_{j \ominus (j \ominus k)} = V_k$, where $l = j \ominus k$ follows from
Fig. 1 and the definition of the $X$ gate. This is an extra
restriction over the slow protocol, where the $V_k$ can be
arbitrary unitaries.

Example 1.

Case (i). $\mathcal{U}$ is the $N$-dimensional CNOT gate, with $N =
d = d_A = d_B$, and the $V_k = X^k$ form a cyclic group. This is
the class of unitaries implementable by Dang and Fan's
protocol shown in Fig. 2 of [13].

Case (ii). $\mathcal{U}$ is a controlled unitary of the form (2) with
$N = d_A = 3$, $d_B = 2$, $V_k = \mathrm{diag}(1, e^{2\pi i k/3})$, $k = 0, 1, 2$.
(The $V_k$ are not shift operators, so this is more general
than the protocol in [13].)
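
As a quick numerical check of the correction step for Case (ii), the following minimal sketch verifies that $V_l^\dagger V_j = V_k$ whenever $l = (j - k) \bmod N$. The helper names (`V`, `mul`, `dagger`, `close`, `correction_restores_Vk`) are ours, chosen for illustration, and the diagonal matrices are stored simply as lists of their diagonal entries.

```python
import cmath

N = 3  # order of the cyclic group; equals d_A in Case (ii)

def V(k):
    """Diagonal of V_k = diag(1, exp(2*pi*i*k/3)), stored as a list."""
    return [1 + 0j, cmath.exp(2j * cmath.pi * k / N)]

def mul(a, b):
    """Product of two diagonal matrices given by their diagonals."""
    return [x * y for x, y in zip(a, b)]

def dagger(a):
    """Adjoint of a diagonal matrix given by its diagonal."""
    return [x.conjugate() for x in a]

def close(a, b, tol=1e-12):
    """Entry-wise comparison of two diagonals."""
    return all(abs(x - y) < tol for x, y in zip(a, b))

def correction_restores_Vk():
    """Check V_l^dagger V_j == V_k with l = (j - k) mod N, for all j and k."""
    for k in range(N):
        for j in range(N):
            l = (j - k) % N
            if not close(mul(dagger(V(l)), V(j)), V(k)):
                return False
    return True

print(correction_restores_Vk())
```

The check succeeds only because the $V_k$ multiply like elements of a cyclic group; replacing `V` by an arbitrary family of unitaries would in general make it fail, which is the extra restriction of the fast protocol relative to the slow one.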



B. Generalization to a controlled-Abelian-group

The fast protocol for controlled-cyclic-group unitaries
is easily generalized to the case where the $V_k$ in (1) form
an ordinary representation of an Abelian group $G$ of order
$N$. Again it suffices (Appendix A 2) to consider the
case (2) where the $P_k$ are rank 1 projectors. Any finite
Abelian group is the direct sum (direct product) of $\eta$ cycles,
and it is convenient to adopt a label $k$ for elements
of $G$ that reflects this structure, by thinking of it as an
$\eta$-tuple of integers,

$$k = (k_1, k_2, \ldots, k_\eta), \tag{5}$$

with $0 \le k_i \le r_i - 1$, where $r_i$ is the length of
the $i$-th cycle. In this way group multiplication, with
$k = (0, 0, \ldots, 0)$ the identity, is the same as vector addition,
modulo $r_i$ for the $i$-th component. Similarly,
the $j$ labels on the systems $a$ and $b$ in Fig. 2, and the
measurement outcomes $l$ and $m$, can also be written as $\eta$-tuples:
$j = (j_1, j_2, \ldots, j_\eta)$, etc. In the following we will
make use of the inner product of two $\eta$-tuples such as

$$(j \cdot m) = \sum_{i=1}^{\eta} j_i m_i N / r_i.$$

The $X$, $Z$, and $F$ gates are now appropriate tensor
products of the cyclic group gates in (4), for example,
$X^k$ understood as $\bigotimes_{i=1}^{\eta} X_i^{k_i}$ and $X^k |j\rangle = |j \ominus k\rangle$, using
the obvious $\eta$-tuple definition of $j \ominus k$. The $Z_m$ gate in
Fig. 2 is the tensor product of the $Z_{m_i}$ gates for the
different cycles,

$$Z_m = \sum_{k} e^{-2\pi i (k \cdot m)/N} |k\rangle\langle k|. \tag{6}$$

Here is why the protocol works. Assume an initial
product state $|k\rangle \otimes |\omega_k\rangle$ on $\mathcal{H}_A \otimes \mathcal{H}_B$. Then the operator
implemented on $B$ is $V_l^\dagger V_j = V_{j \ominus k}^\dagger V_j = V_{k \ominus j} V_j = V_k$.
The $F$ gate on $b$ before the measurement gives rise to
a phase $e^{2\pi i (j \cdot m)/N}$, which is partially compensated by
the phase $e^{-2\pi i (k \cdot m)/N}$ from the $Z_m$ gate on $A$, and since
$j = k \oplus l$, we are left with an overall phase of $e^{2\pi i (l \cdot m)/N}$.
Since this phase is independent of $k$, a superposition of
initial product states of this form for different $k$ will also
be transformed by $\mathcal{U}$, up to an overall phase that is of no
concern.
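
The phase bookkeeping in this argument can be checked numerically. The sketch below is illustrative only: it assumes the group $G = C_2 \times C_3$ (so $N = 6$) and a weighted inner product $(j \cdot m) = \sum_i j_i m_i N / r_i$, which is our reading of the definition and is what makes the phases a tensor product of cyclic-group phases. All function names are hypothetical.

```python
import cmath
from itertools import product

# Assumed example group: G = C_2 x C_3, i.e. eta = 2 cycles and N = 6.
R = (2, 3)
N = R[0] * R[1]
ELEMENTS = list(product(*(range(r) for r in R)))

def add(j, k):
    """Component-wise sum of two tuples, modulo r_i in slot i."""
    return tuple((a + b) % r for a, b, r in zip(j, k, R))

def dot(j, m):
    """Weighted inner product sum_i j_i m_i N / r_i (assumed form)."""
    return sum(a * b * (N // r) for a, b, r in zip(j, m, R))

def root(x):
    """exp(2 pi i x / N)."""
    return cmath.exp(2j * cmath.pi * x / N)

def net_phase(k, l, m):
    """Phase from F on b times phase from Z_m on A, with j = k + l."""
    j = add(k, l)
    return root(dot(j, m)) * root(-dot(k, m))

def phase_independent_of_k(tol=1e-12):
    """True if, for every outcome pair (l, m), the net phase is the same
    for all k and equals exp(2 pi i (l.m) / N)."""
    for l in ELEMENTS:
        for m in ELEMENTS:
            target = root(dot(l, m))
            if any(abs(net_phase(k, l, m) - target) > tol for k in ELEMENTS):
                return False
    return True

print(phase_independent_of_k())
```

Because the residual phase depends only on the outcomes $(l, m)$, it factors out of any superposition over $k$, which is the property the argument relies on.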

Note that the Vk themselves may, but need not, be 
tensor products, as illustrated in the following example. 

Example 2. 

Case (i): $d_A = d_B = 4$. The $V_k$ defined by

$$V_{(0,0)} = \mathrm{diag}(1, 1, 1, 1), \quad V_{(0,1)} = \mathrm{diag}(1, 1, -1, -1),$$
$$V_{(1,0)} = \mathrm{diag}(1, -1, 1, -1), \quad V_{(1,1)} = \mathrm{diag}(1, -1, -1, 1), \tag{7}$$

are tensor products, and form a group $C_2 \times C_2$. If one
regards $\mathcal{H}_A$ as well as $\mathcal{H}_B$ as a tensor product of two
qubits, the $\mathcal{U}$ defined by (2) is itself a tensor product
$\mathcal{U} = \mathcal{U}_1 \otimes \mathcal{U}_2$, with each factor a controlled-cyclic-group
unitary with one qubit on the $A$ and the other on the $B$
side. Thus $\mathcal{U}$ can be implemented by an overall protocol
which is just two smaller protocols running in parallel
with each other, one for $\mathcal{U}_1$ and the other for $\mathcal{U}_2$.

Case (ii): $d_B = 3$. Modify the $V_k$ in (7) by keeping
only the first three rows and columns, so they are no
longer tensor products, though they still form a group
$C_2 \times C_2$. Consequently, the protocol that carries out $\mathcal{U}$
can no longer be viewed as two smaller protocols running
in parallel.
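
Both cases of this example are easy to verify by hand, but a short sketch may still be useful. The code below stores each $V_k$ of (7) by its diagonal, checks the $C_2 \times C_2$ multiplication law before and after truncation to three dimensions, and tests the tensor-product property of the four-dimensional matrices. The function names are ours, and the tensor-product test uses the elementary fact that a $4 \times 4$ diagonal matrix with nonzero entries $(d_0, d_1, d_2, d_3)$ is a product of two $2 \times 2$ diagonal matrices exactly when $d_0 d_3 = d_1 d_2$.

```python
from itertools import product

# Diagonals of the V_k in Eq. (7), labelled by pairs (k_1, k_2).
V4 = {
    (0, 0): (1, 1, 1, 1),
    (0, 1): (1, 1, -1, -1),
    (1, 0): (1, -1, 1, -1),
    (1, 1): (1, -1, -1, 1),
}

def truncate(table, d):
    """Keep only the first d rows and columns of each diagonal matrix."""
    return {k: v[:d] for k, v in table.items()}

def is_c2xc2_representation(table):
    """Check V_k V_k' == V_(k + k') with addition modulo 2 in each slot."""
    for k, kp in product(table, repeat=2):
        s = ((k[0] + kp[0]) % 2, (k[1] + kp[1]) % 2)
        if tuple(a * b for a, b in zip(table[k], table[kp])) != table[s]:
            return False
    return True

def is_tensor_product(diag):
    """A 4x4 diagonal (d0, d1, d2, d3) factorizes iff d0*d3 == d1*d2."""
    return diag[0] * diag[3] == diag[1] * diag[2]

V3 = truncate(V4, 3)
print(is_c2xc2_representation(V4), is_c2xc2_representation(V3))
print(all(is_tensor_product(v) for v in V4.values()))
```

The truncated matrices act on a three-dimensional space, which has no two-qubit tensor structure, so the parallel decomposition of Case (i) is not available even though the group law still holds.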

C. Controlled subset of an Abelian group

Assume that the $\{V_k\}$ in (1) form an ordinary representation
of an Abelian group of order $N$, but the sum over
$k$ is restricted to some subset $S$ of the set of $N$ $\eta$-tuples
defined in the last subsection. It will suffice once again
to consider the case of rank-one projectors, i.e., (2). The
$j$, $l$, and $m$ in Fig. 2 run over the same range as before,
but $k$ is restricted to the set $S$. Therefore the dimension
$d_A = n$ of $\mathcal{H}_A$ is less than the Schmidt rank of $|\Phi\rangle$, which
is the order $N$ of the group. It is convenient to use the
elements of $S$ to label the kets that form the basis of $\mathcal{H}_A$
in (2) [corresponding to the projectors in (1)]. The operator
$Z_m$ is now given by (6), but with $k$ restricted to $S$.
The reason that the fast protocol in Fig. 2 will work in
this case is the same as given above in Sec. II B; the fact
that $k$ is restricted to a subset makes no difference.

The significance of this extension of the result in
Sec. II B is that it enlarges the class of fast unitaries that
can be carried out using a protocol of this sort, though 
perhaps with a significant increase in the entanglement 
cost. This is illustrated by the following example, which 
shows that in certain cases one can approximate a con- 
tinuous family of unitaries using sufficient entanglement 
(a large enough N). 

Example 3.

Consider a unitary on two qubits $A$ and $B$ of the form

$$\mathcal{U} = |0\rangle\langle 0| \otimes I + |1\rangle\langle 1| \otimes R, \tag{8}$$

where $R = V_m = \mathrm{diag}(1, e^{2\pi i m/N})$ for some integer
$m < N$. By relabeling $|1\rangle\langle 1|$ on $A$ as $|m\rangle\langle m|$, we see
that (8) is of the form (2) with $k$ in the sum restricted to
the two values in $S = \{0, m\}$. Thus $\mathcal{U}$ can be carried out
in a fast way at an entanglement cost of $\log_2 N$ ebits. In
general, any two-qubit controlled unitary is of the form
(8) (up to local unitaries on $A$ and $B$, before and after
$\mathcal{U}$) with $R = \mathrm{diag}(1, e^{i\phi})$ for some real number $\phi$. Since $\phi$
can be approximated by $2\pi$ multiplied by a rational number
$m/N$ with large enough $N$, any two-qubit controlled
unitary can be approximately implemented (up to local
unitaries) using this fast protocol by setting $R$ equal to
$\mathrm{diag}(1, e^{2\pi i m/N})$ for suitable $m$ and $N$; the entanglement
cost is again $\log_2 N$ ebits.
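
The trade-off between accuracy and entanglement in this example can be made concrete. The following sketch, with a hypothetical helper `approximate_phase`, picks the best rational approximation $m/N$ to $\phi / 2\pi$ with $N$ bounded by a chosen maximum, and reports the phase error together with the entanglement cost $\log_2 N$. It assumes $0 \le \phi < 2\pi$ and uses the standard-library `Fraction.limit_denominator` to find the closest fraction.

```python
import math
from fractions import Fraction

def approximate_phase(phi, max_order):
    """Approximate phi in [0, 2*pi) by 2*pi*m/N with N <= max_order.

    Returns (m, N, error, ebits), where error = |phi - 2*pi*m/N| and
    ebits = log2(N) is the entanglement cost quoted in the text.
    """
    frac = Fraction(phi / (2 * math.pi)).limit_denominator(max_order)
    m, n = frac.numerator, frac.denominator
    error = abs(phi - 2 * math.pi * m / n)
    return m % n, n, error, math.log2(n)

# A phase that is exactly representable, and a generic one at two budgets.
print(approximate_phase(math.pi / 2, 16))
print(approximate_phase(1.0, 10))
print(approximate_phase(1.0, 1000))
```

Allowing a larger group order reduces the error at the price of more ebits, in line with the remark that a continuous family can be approximated using sufficient entanglement.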

A further generalization to arbitrary $d_A$ and $d_B$ is possible
for any unitary $\mathcal{U}$ which is diagonal in a product of
bases, a basis on $\mathcal{H}_A$ and another on $\mathcal{H}_B$. When such a
diagonal $\mathcal{U}$ is written in the form (2), the unitaries $V_k$ are
diagonal. Each diagonal element of $V_k$ is approximately
an integer root of unity, hence each Vk is approximately 
an integer root of the identity operator, and the whole set 
{Vk} can be approximated by a subset of an ordinary rep- 
resentation of an Abelian group of sufficient size. Thus 
any bipartite unitary diagonal in a product of bases can 
be approximately implemented by a fast protocol. 

III. FAST PROTOCOL FOR DOUBLE-GROUP
UNITARIES

In this section we consider a fast protocol for "double-group"
unitaries of the form

$$\mathcal{U} = \sum_{f \in G} c(f)\, \Gamma(f) = \sum_{f \in G} c(f)\, U(f) \otimes V(f), \tag{9}$$

where $G$ is a group of order $N$, $U(f)$ and $V(f)$ are unitaries
on Hilbert spaces $\mathcal{H}_A$ and $\mathcal{H}_B$ of dimension $d_A$ and
$d_B$, respectively, and the operators $\Gamma(f) = U(f) \otimes V(f)$
form a projective representation of $G$, in the sense that

$$\Gamma(g)\Gamma(h) = \lambda(g, h)\, \Gamma(gh). \tag{10}$$

The collection $\{\lambda(g, h)\}$ of complex numbers of unit magnitude
is known as the factor system. Similarly, $\{U(f)\}$
and $\{V(f)\}$ each form a projective unitary representation
of $G$ with individual factor systems which may differ from
one another, whose product for a given $g$ and $h$ is $\lambda(g, h)$.
From Sec. 12.2.1 of [18], for our purposes we can assume
the factor system $\{\lambda(g, h)\}$ is standard, that is,

$$\lambda(e, e) = \lambda(e, f) = \lambda(f, e) = 1, \quad \forall f \in G, \tag{11}$$

where $e$ is the identity element in $G$.

A. Protocol

The slow protocol in Sec. IV D of [3] for implementing
unitaries of the type (9) is shown in Fig. 3. The two
parties share a maximally entangled state

$$|\Phi\rangle = \frac{1}{\sqrt{N}} \sum_{f \in G} |f\rangle_a \otimes |f\rangle_b \tag{12}$$

on the ancillary systems $a$, $b$, each of dimension $N$, the
order of $G$. Alice and Bob perform controlled-$U(f)$ and
controlled-$V(f)$ gates on $aA$ and $bB$, respectively. Alice
follows this with a $T$ gate on $a$, where in the standard
basis $T$ is a complex Hadamard matrix $\hat{T}$ divided by $\sqrt{N}$,
that is, a unitary with all elements of the same magnitude
$1/\sqrt{N}$. Then she does a measurement on $a$ in the standard
basis and sends the result $l$ to Bob. Bob carries out
a $Z_l$ gate on $b$, where each $Z_l$ is a diagonal unitary matrix
whose diagonal elements are the complex conjugates of
those in the $l$-th row of $\hat{T}$. Thus $T$ and $Z_l$ generalize the
Fourier gate $F$ and the corresponding $Z$-type gate in our previous paper
[3]. This more general choice does not extend the set
of unitaries the slow protocol can implement, since the
phases in $T$ and $Z_l$ cancel each other, but it allows the
fast protocol to implement a larger set of unitaries than
would otherwise be possible.

Next, Bob applies a unitary gate

$$C = \sum_{f, g \in G} \lambda(g, g^{-1}f)\, c(g^{-1}f)\, |g\rangle\langle f| \tag{13}$$

to $b$, where $\lambda(g, h)$ is defined in (10) and the coefficients
$c(f)$ are those in (9). (The coefficients $c(f)$ are not
uniquely defined by $\mathcal{U}$ if the $\Gamma(f)$ are linearly dependent,
but there is always at least one choice for which
$C$ is unitary; see Theorem 7 of [3].) Then he measures $b$
in the standard basis and sends the result $m$ to Alice.
To complete the protocol Alice and Bob apply unitary
corrections $(U(m))^\dagger$ and $(V(m))^\dagger$ to their respective systems.

The slow protocol in Fig. 3 can be replaced with the
fast protocol in Fig. 4 provided the $Z_l$ gate in the former,
which in effect determines the basis for Bob's measurement,
can be eliminated at the cost of re-interpreting the
outcome $m$ of his measurement. A sufficient condition for
this is that for every $l$ there exists a complex permutation
matrix $P_l$ (exactly one nonzero element of magnitude 1
in each row and in each column) such that

$$C Z_l = P_l C. \tag{14}$$

The effect of $P_l$ is simply to permute the measurement
outcomes, which can then be re-interpreted once $l$ is
known. The phases in $P_l$ would only introduce a global
phase for the implemented $\mathcal{U}$ (dependent on $l$ and $m$) and
are therefore of no concern.
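
To see condition (14) at work in the simplest setting, the sketch below takes both $T$ and $C$ equal to the Fourier matrix $F$ of (4); this choice is ours, made only for illustration. In that case $Z_l$ has diagonal entries $e^{-2\pi i l k/N}$, and $P_l$ turns out to be a cyclic shift of the row index, which the code checks numerically. The helper names are hypothetical.

```python
import cmath

def fourier(n):
    """Unitary Fourier matrix F with entries exp(2 pi i m k / n) / sqrt(n)."""
    s = n ** 0.5
    return [[cmath.exp(2j * cmath.pi * m * k / n) / s for k in range(n)]
            for m in range(n)]

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]

def z_gate(t, l):
    """Diagonal unitary whose entries are the complex conjugates of
    row l of sqrt(n) * t."""
    n = len(t)
    s = n ** 0.5
    return [[(t[l][k] * s).conjugate() if i == k else 0j for k in range(n)]
            for i in range(n)]

def shift(n, l):
    """Permutation matrix sending row index m to row (m - l) mod n of C."""
    return [[1 if col == (row - l) % n else 0 for col in range(n)]
            for row in range(n)]

def satisfies_eq14(n, tol=1e-12):
    """Check C Z_l = P_l C for every l, with C = T = F and P_l a shift."""
    f = fourier(n)
    for l in range(n):
        lhs = matmul(f, z_gate(f, l))
        rhs = matmul(shift(n, l), f)
        if any(abs(lhs[i][j] - rhs[i][j]) > tol
               for i in range(n) for j in range(n)):
            return False
    return True

print(all(satisfies_eq14(n) for n in range(2, 7)))
```

In this special case $P_l$ carries no extra phases, so omitting $Z_l$ merely relabels Bob's outcome by a shift of $l$, which can be undone once $l$ arrives.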

A useful procedure for generating $C$ and $T$ matrices
for which (14) holds employs the character table $\hat{K}$ of an
Abelian group $H$ of order $N$. This is an $N \times N$ matrix,
all elements of which are of magnitude 1, with columns
labeled by elements of $H$ and rows by its distinct irreducible
representations, all of which are one-dimensional.
Because the representations are one-dimensional, each
row is itself a representation (i.e., the character is the
representation "matrix"). The element-wise product of
two columns is another column, since each row is a representation
of the group; likewise the element-wise product
of two rows is another row, since the (tensor) product of
two representations is a representation. Thus the transpose
of a character table is again a character table. The
complex conjugate of any column (or row) is another column
(row) of $\hat{K}$ corresponding to the inverse of the group
element. The actual order of the columns or the rows is
arbitrary, though it is often convenient to assume that
the first row and the first column contain only 1's. Since
$K = \hat{K}/\sqrt{N}$ is a unitary matrix, a character table $\hat{K}$
is a special case of a complex Hadamard matrix: one
whose elements are all of magnitude 1, and whose rows
(and columns) are mutually orthogonal. If $H$ is a cyclic
group $C_N$, then up to permutations of rows or columns
$\hat{K} = \hat{F} = \sqrt{N} F$, with $F$ the Fourier matrix (4); similarly
if $H$ is the direct product (sum) of cycles, $\hat{K}$ is the tensor
product of the corresponding Fourier matrices [19].

Theorem 1.

(a) Let $\hat{K}$ be the character table of an Abelian group
$H$ of order $N$, and define

$$\hat{T} = \sqrt{N}\, T = L \hat{K}, \qquad \hat{C} = \sqrt{N}\, C = M \hat{K} D, \tag{15}$$

where $L$ and $M$ are complex permutation matrices, and
$D$ a diagonal matrix with diagonal elements of magnitude
1, thus a diagonal unitary. Let $Z_l$ be the diagonal matrix
with diagonal elements equal to the complex conjugates
of those forming row $l$ of $\hat{T}$. Then there exist complex
permutation matrices $P_l$ such that (14) is satisfied for
every $l$.

(b) If the rows of an $N \times N$ matrix $\hat{T}$ are linearly
independent and form a (necessarily Abelian) group $H$
up to phases under element-wise multiplication, then $\hat{T}$
is of the form given in (15).

The proof is in Appendix B. Note that the group $H$
need not be isomorphic to the group $G$ represented by
$\{\Gamma(f)\}$ although they are of the same order; see Example 8
in Sec. III C with $n > 2$. The matrix $\hat{T}$ defined
in (15) has the property that the rows under element-wise
products form the Abelian group $H$ up to a possible
phase factor determined by $L$.
FIG. 3. The slow protocol that implements the unitary $\mathcal{U}$ in (9).

FIG. 4. The fast protocol that implements the unitary $\mathcal{U}$ in (9) when the conditions given in Theorem 1 are satisfied. The $g$
for the final corrections depends on both measurement outcomes $l$ and $m$.



One consequence of (15) is

$$C = P T D, \tag{16}$$

where $P$ is a complex permutation matrix and $D$ is a
diagonal unitary. When the matrix $C$ is of the form (15),
the fact that all its elements are of the same magnitude,
as implied by (15), means that the same is true
of the $c(f)$. A partial answer to the question of whether
(14) implies that $T$ and $C$ have the form given in (15) is
provided by the following theorem, whose proof is also in
Appendix B.

Theorem 2. For each $l$ let $Z_l$ be the diagonal matrix
whose diagonal elements are the complex conjugates
of those forming row $l$ of a complex Hadamard matrix
$\hat{T} = \sqrt{N}\, T$. If there exists a unitary matrix $C$ without
a zero element in its first row, and complex permutation
matrices $P_l$ such that Eq. (14) holds for every $l$, then the
matrices $Z_l$ form an Abelian group up to phases, and (15)
and (16) hold.

It is worth noting that with $T$ and $C$ of the form (15)
there is a symmetrical version of the fast protocol in
Fig. 4 in which the $C$ gate on the $B$ side is replaced
with $T$. This requires that the entangled resource $|\Phi\rangle$ be
changed to

$$|\Phi'\rangle = (I \otimes D)\, |\Phi\rangle. \tag{17}$$

The reason this works is that the changes produced by
$P$ in (16) in the measurement outcome $m$ can always be
compensated by altering the function $g(l, m)$ that determines
the final corrections. The following is a simple
example.

Example 4.

The two-qubit unitary,

$$\mathcal{U} = \frac{1}{\sqrt{2}} (I_A \otimes I_B + i Z_A \otimes Z_B), \tag{18}$$

where $Z_A$ and $Z_B$ are Pauli $\sigma_z$ gates on $A$ and $B$, is
equivalent under local unitaries to a CNOT gate. It is of
the form (9) with $G$ the cyclic group $C_2$ of order 2, and
can be implemented by the fast protocol in Fig. 4 using
the matrices

$$T = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad C = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & i \\ i & 1 \end{pmatrix}, \tag{19}$$

where the rows of $T$ multiplied by $\sqrt{2}$ form a group $H =
C_2$. Because $T$ and $C$ change the measurement basis,
Alice and Bob effectively perform measurements of $\sigma_x$
and $\sigma_y$, respectively; this is the same thing (with the
parties interchanged) as the fast protocol in [12]. An
equivalent symmetrized protocol in which $C$ is replaced
by $T$ employs a resource state $|\Phi'\rangle = \frac{1}{\sqrt{2}}(|00\rangle + i|11\rangle)$,
and both parties perform a $\sigma_x$ measurement.
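
The claims of this example can be checked with a few lines of arithmetic. In the sketch below the matrix $C$ is the one obtained from (13) with $c(0) = 1/\sqrt{2}$, $c(1) = i/\sqrt{2}$ and a trivial factor system, and $T$ is taken to be the real Hadamard matrix; both are our reading of the example and should be treated as assumptions. The code computes $P_l = C Z_l C^\dagger$ and tests whether each is a complex permutation matrix, as (14) requires. All helper names are hypothetical.

```python
S = 2 ** -0.5
T = [[S, S], [S, -S]]            # assumed: Hadamard, rows of sqrt(2)*T form C_2
C = [[S, 1j * S], [1j * S, S]]   # from (13) with c(0) = 1/sqrt(2), c(1) = i/sqrt(2)

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Conjugate transpose."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def z_gate(l):
    """Diagonal unitary built from the conjugated row l of sqrt(2) * T."""
    return [[complex(T[l][k] / S).conjugate() if i == k else 0j
             for k in range(2)] for i in range(2)]

def is_complex_permutation(p, tol=1e-12):
    """Exactly one entry of magnitude 1 in each row and column, rest zero."""
    lines = [list(r) for r in p] + [list(c) for c in zip(*p)]
    for line in lines:
        mags = sorted(abs(x) for x in line)
        if abs(mags[-1] - 1) > tol or any(v > tol for v in mags[:-1]):
            return False
    return True

def p_matrices():
    """P_l = C Z_l C^dagger for l = 0, 1 (valid because C is unitary)."""
    return [matmul(matmul(C, z_gate(l)), dagger(C)) for l in range(2)]

print([is_complex_permutation(p) for p in p_matrices()])
```

With these matrices $P_0$ is the identity and $P_1$ equals the Pauli matrix $\sigma_y$, so omitting $Z_l$ only swaps Bob's two outcomes (up to a phase) when $l = 1$.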



B. Which unitaries can be carried out using the
fast protocol?

Given a particular bipartite unitary $\mathcal{U}$, can it be implemented
using the fast double-group protocol? Any
such $\mathcal{U}$ can always be written in the form (9) using a
sufficiently large group (see Sec. V A of [3]), and typically
there are many different ways of constructing such
an expansion. However, for our fast protocol to work,
assuming that $T$ and $C$ are of the form given in (15),
one must find a particular expansion, a particular group
$G$ and unitaries $U(f)$ and $V(f)$ along with expansion
coefficients $c(f)$, that satisfy appropriate conditions. In
particular, (i) the $c(f)$ must all be of the same magnitude
$1/\sqrt{N}$, as noted following (16). But two additional
conditions must be checked: (ii) the matrix $C$ defined in
(13) must be unitary, and (iii) $C$ must be related to the
character table of some group $H$ as in (15). Condition
(iii) can be checked in the following way. Multiply each
row and then each column of $\hat{C} = \sqrt{N}\, C$ by a suitable
phase such that the resulting matrix $\hat{C}' = Q \hat{C} R$, where
$Q$ and $R$ are diagonal unitaries, has 1's in the first row
and first column. Then check whether its rows (alternatively,
its columns) form a group $H$ under component-wise
multiplication. If this is so, then $\hat{C}'$ is a character
table of $H$. Equating it to $\hat{K}$ in (15), letting $D = R^{-1}$
and $M = Q^{-1}$, and choosing any complex permutation
matrix $L$, we arrive at a $T$ which along with this $C$ satisfies
the conditions of Theorem 1. Thus, provided (i), (ii),
and (iii) are satisfied there is in fact a fast unitary $\mathcal{U}$.
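
The test for condition (iii) is mechanical and can be sketched in code. The function names below are ours. `normalize` rescales rows and then columns so that the first column and first row consist of 1's, and `rows_form_group` checks closure of the rows under component-wise multiplication; for a finite set of vectors with unit-modulus entries, closure is enough to guarantee a group. The two test matrices are illustrative choices: the $2 \times 2$ matrix with rows $(1, i)$ and $(i, 1)$, and a one-parameter $4 \times 4$ complex Hadamard matrix whose rows fail to close for a generic phase.

```python
import cmath

def normalize(c_hat):
    """Rescale rows, then columns, by phases so that the first column
    and the first row of the result contain only 1's."""
    rows = [[x / row[0] for x in row] for row in c_hat]
    first = rows[0]
    return [[x / f for x, f in zip(row, first)] for row in rows]

def rows_form_group(c_hat, tol=1e-9):
    """True if the rows of the normalized matrix are closed under
    component-wise multiplication."""
    m = normalize(c_hat)

    def is_row(v):
        return any(all(abs(a - b) < tol for a, b in zip(v, row)) for row in m)

    return all(is_row([a * b for a, b in zip(r1, r2)])
               for r1 in m for r2 in m)

# Passes: normalizes to the character table of C_2.
print(rows_form_group([[1, 1j], [1j, 1]]))

# Fails for a generic phase t: the square of the second row is not a row.
t = cmath.exp(0.3j)
print(rows_form_group([[1, 1, 1, 1],
                       [1, t, -1, -t],
                       [1, -1, 1, -1],
                       [1, -t, -1, t]]))
```

The second matrix is still a complex Hadamard matrix, which illustrates that unitarity of $C$ alone does not imply condition (iii).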

The scheme just described provides a useful approach
for constructing examples. Start with a group $G$ and
(projective) unitary representations $\{U(f)\}$ and $\{V(f)\}$
on $\mathcal{H}_A$ and $\mathcal{H}_B$, and look for a set of coefficients $\{c(f)\}$
of equal magnitude such that $C$ given by (13) is unitary,
and satisfies condition (iii) in the preceding paragraph.
The search is aided by noting that any factor system, see
Sec. 12.2.2 of [18], is equivalent to a normalized factor
system in which each $\lambda(g, f)$ is an $N$-th root of 1. A
consequence, proved in Appendix B, is the following:

Theorem 3. Let $\Gamma(f)$ in (9) be a projective representation
of a group $G$ of order $N$ with a normalized (see
above) factor system $\lambda(g, h)$, (10). Assume the matrix $C$
defined in (13) is of the form given in (15). Then the
coefficients $\{c(f)\}$ in (13) can be written in the form

$$c(f) = (\gamma/\sqrt{N}) \exp[2\pi i\, k(f)/N^2], \tag{20}$$

where $k(f)$ is an integer that depends upon $f$, and $\gamma$ is a
phase factor independent of $f$.

The theorem justifies the following exhaustive, albeit
tedious, search procedure for possible sets of coefficients
$c(f)$, assuming they all have the same magnitude, once a
group $G$, a projective representation of $G$, and a normalized
factor system have been chosen. Consider all possible
sets of coefficients of the form (20), setting $\gamma = 1$ and
$c(e) = 1/\sqrt{N}$ for the identity $e$ of $G$, as the global phase
of $\mathcal{U}$ is unimportant. For each set, check that the matrix
$C$ given by (13) is unitary. Then see if the rows of the
corresponding $\hat{C}'$, constructed as described above, form
a group under component-wise multiplication. Using this
procedure we have been able to show that if $N = 2$, so
the group is $C_2$, the only two-qubit unitaries that can
be implemented by our fast protocol are either trivial
products of unitaries or else equivalent under local unitaries
to a CNOT gate. (Note that there are additional
fast two-qubit unitaries that can be carried out using a
bigger group, thus larger $N$ and more entanglement.)



Both conditions (ii) and (iii) are nontrivial requirements.
Not every case in which the $c(f)$ are of equal
magnitude will lead to a unitary matrix $C$. For example,
if $c(f) = 1/\sqrt{N}$ for every $f$ and $\{\Gamma(f)\}$ is an ordinary
representation of $G$, so $\lambda(g, h) = 1$, then (13) obviously does
not define a unitary matrix. And even if $C$ is unitary,
condition (iii) may not hold. For example, the unitary
in Eq. (58) of [3] with $c(0,0) = c(1,0) = 1/2$, $c(0,1) =
e^{i\alpha}/2$, $c(1,1) = -e^{i\alpha}/2$, assuming $\alpha$ is not an integer
multiple of $\pi/4$, results in a $\hat{C}'$ matrix whose rows do
not form a group.

For every ordinary representation of an Abelian group 
G there is a corresponding fast protocol, as the group 
is a direct product (sum) of cycles, and one can apply 
the construction in Example 6 below. For general pro- 
jective representations or non-Abelian groups the matter
remains open. 

C. Examples 

The examples which follow represent just a few of the
unitaries that can be carried out by our double-group
fast protocol. Examples 5 and 6 make relatively efficient
use of entanglement resources, in that the order of the
group, which is the rank of the fully-entangled resource
state, is equal to the Schmidt rank (or Schmidt number
[20]) of $\mathcal{U}$, that is, the minimum number of summands required
to represent it as a sum of products of operators on $A$ and
$B$. Examples 7 and 8, the latter involving a non-Abelian
group $G$, illustrate how the class of fast unitaries can be
significantly expanded by using more entanglement.

Note that any two-qubit unitary is equivalent under
local unitaries to one of the form (see [21])

$$\mathcal{U} = \exp[i(\alpha\, \sigma_x \otimes \sigma_x + \beta\, \sigma_y \otimes \sigma_y + \gamma\, \sigma_z \otimes \sigma_z)], \tag{21}$$

where $\alpha$, $\beta$, and $\gamma$ are real numbers that can be calculated
from the matrix of $\mathcal{U}$ (see, e.g., the appendix of [25] for
the method of calculation). For the two-qubit examples
below we give the values of $\alpha$, $\beta$, and $\gamma$.

Example 5. 

In the two-qubit unitary

\[ U = c(0)\, I \otimes I + c(1)\, X \otimes X + c(2)\, Z \otimes Z + c(3)\, XZ \otimes XZ, \tag{22} \]

with $I$ the identity, $X$ and $Z$ the Pauli operators $\sigma_x$ and $\sigma_z$, and $G$ the group $C_2 \times C_2$, the method of search indicated in Sec. III B yields the following possibilities for $c = (c(0), c(1), c(2), c(3))$.

(a) The case $c = (1, 1, 1, -1)/2$ is equivalent to the SWAP gate defined in [23], in which the two qubits are interchanged; $\alpha = \beta = \gamma = \pi/4$ in (21). An alternative fast protocol for this gate consists of teleportation done simultaneously in both directions.

(b) The case $c = (1, i, 1, -i)/2$ implements the $U_{XY}$ gate as defined in [23], equivalent under local unitaries to the double-CNOT (DCNOT) gate defined in Ref. [ ]; $\alpha = \beta = \pi/4$, $\gamma = 0$.

(c) The case $c = (1, 1, \zeta, \zeta^5)/2$, where $\zeta = e^{i\pi/4}$; $\alpha = \beta = \pi/4$, $\gamma = \pi/8$.

In each case the entanglement resource of two ebits required to carry out the protocol is the minimum possible amount, since the unitary is capable of creating two ebits of entanglement.
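The three coefficient sets of Example 5 can be checked numerically. The following is a minimal sketch, not part of the original protocol description: the helper names (`double_group_unitary`, `is_unitary`, `CASES`) are hypothetical, and the fourth coefficient in case (c) is taken to be $\zeta^5 = -\zeta$, which is an assumption of this sketch (it is the value for which the sum in (22) is unitary).

```python
import numpy as np

# Operators of Eq. (22): I, X, Z, XZ (a projective representation of C2 x C2).
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
OPS = [I2, X, Z, X @ Z]

def double_group_unitary(c):
    """Return sum_f c(f) * G(f) (x) G(f) for the four operators above."""
    return sum(cf * np.kron(g, g) for cf, g in zip(c, OPS))

def is_unitary(u, tol=1e-12):
    return np.allclose(u.conj().T @ u, np.eye(u.shape[0]), atol=tol)

zeta = np.exp(1j * np.pi / 4)
CASES = {
    "a": np.array([1, 1, 1, -1]) / 2,           # SWAP
    "b": np.array([1, 1j, 1, -1j]) / 2,         # U_XY / DCNOT class
    "c": np.array([1, 1, zeta, zeta**5]) / 2,   # assumed fourth entry zeta^5
}

SWAP = np.array([[1, 0, 0, 0],
                 [0, 0, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1]], dtype=complex)

for name, c in CASES.items():
    print(name, is_unitary(double_group_unitary(c)))
```

With these conventions case (a) reproduces the SWAP matrix exactly, because $XZ \otimes XZ = -\sigma_y \otimes \sigma_y$.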

Example 6. 

When the $\{\Gamma(f)\}$, with $f$ an integer between $0$ and $N-1$, form an ordinary representation of the cyclic group $C_N$ of order $N$, the coefficients

\[ c(f) = \begin{cases} (1/\sqrt{N}) \exp(-i\pi f^2/N), & N \text{ even},\\ (1/\sqrt{N}) \exp[-i\pi f(f+1)/N], & N \text{ odd}, \end{cases} \tag{23} \]

will provide a fast implementation of the corresponding double-group unitary. In particular with $U(f) = V(f) = Z^f$ this becomes

\[ U = \sum_{f=0}^{N-1} c(f)\, Z^f \otimes Z^f. \tag{24} \]

The method of proof of Theorem 4 can be used to show the equivalence of (24) with $U = \sum_{k=0}^{N-1} |k\rangle\langle k| \otimes Z^k$, which in turn is locally equivalent to the $N$-dimensional CNOT gate of Example 1.
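A short numerical sketch of Example 6 follows. It builds the cyclic-group coefficients, forms the sum over $Z^f \otimes Z^f$, and checks that the result is unitary and differs from the controlled-$Z^k$ unitary only by diagonal local phases. The convention $Z = \mathrm{diag}(e^{2\pi i k/N})$ and all function names are assumptions made for illustration.

```python
import numpy as np

def cyclic_coefficients(N):
    """c(f) = exp(-i*pi*f^2/N)/sqrt(N) for even N, exp(-i*pi*f(f+1)/N)/sqrt(N) for odd N."""
    f = np.arange(N)
    quad = f * f if N % 2 == 0 else f * (f + 1)
    return np.exp(-1j * np.pi * quad / N) / np.sqrt(N)

def clock(N):
    # Assumed convention (not fixed by the text): Z = diag(exp(2*pi*i*k/N)).
    return np.diag(np.exp(2j * np.pi * np.arange(N) / N))

def example6_unitary(N):
    Z, c = clock(N), cyclic_coefficients(N)
    powers = [np.linalg.matrix_power(Z, f) for f in range(N)]
    return sum(c[f] * np.kron(powers[f], powers[f]) for f in range(N))

def controlled_power(N):
    """sum_k |k><k| (x) Z^k."""
    Z = clock(N)
    return sum(np.kron(np.diag(np.eye(N)[k]), np.linalg.matrix_power(Z, k))
               for k in range(N))

def is_unitary(u, tol=1e-10):
    return np.allclose(u.conj().T @ u, np.eye(u.shape[0]), atol=tol)

def differ_by_local_phases(U, V, N):
    """For diagonal U, V: test whether diag(U)/diag(V) has the product form a * x_j * y_k."""
    ratio = (np.diag(U) / np.diag(V)).reshape(N, N)
    return np.allclose(ratio * ratio[0, 0], np.outer(ratio[:, 0], ratio[0, :]))

for N in (2, 3, 4, 5, 6):
    U = example6_unitary(N)
    print(N, is_unitary(U), differ_by_local_phases(U, controlled_power(N), N))
```

The product-form test works because both operators are diagonal in the computational basis, so local equivalence by diagonal unitaries reduces to a factorization of the ratio of their diagonal entries.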

Example 7. 

The unitary 



2V2 



{i®i + X(g)X + cz®z + c^xz (g) xz 



+C^i (g) / + c'^x ® X + c^z ® z + c^xz «) XZ), 



(25) 



of Schmidt rank 4 on two qubits, where C = e"/'' and 
the operators are the same as in Example 5, employs an unfaithful representation (each operator, e.g. $X \otimes X$, appears twice in the sum) of the Abelian group $C_2 \times C_2 \times C_2$, with the eight coefficients being the corresponding $c(f)$. It can be verified that this $\{c(f)\}$ set satisfies the requirements for the fast protocol. As this group is of order 8 the protocol requires a resource of 3 ebits, and we have not found any fast protocol which can implement this unitary using less entanglement. It corresponds to $\alpha = \pi/4$, $\beta = \pi/8$, $\gamma = 0$ in (21) (the B gate of Ref. [ ]).

Example 8. 

For any given integer $n > 2$, let $U(f) = V(f)$ ($0 \le f \le 2n-1$) be the $2 \times 2$ matrices

\[ U(f) = \begin{cases} \begin{pmatrix} \cos(2f\pi/n) & -\sin(2f\pi/n)\\ \sin(2f\pi/n) & \cos(2f\pi/n) \end{pmatrix}, & 0 \le f \le n-1,\\[2ex] \begin{pmatrix} \cos(2f\pi/n) & -\sin(2f\pi/n)\\ -\sin(2f\pi/n) & -\cos(2f\pi/n) \end{pmatrix}, & n \le f \le 2n-1. \end{cases} \tag{26} \]

They form an irreducible ordinary representation of the dihedral group $D_n$ of order $N = 2n$, where the first kind in (26) correspond to rotations and the second kind to reflections. Let

\[ c(f) = \begin{cases} \dfrac{\epsilon(f)}{\sqrt{2n}} \exp[i\pi m f^2/n], & n \text{ even},\\[2ex] \dfrac{\epsilon(f)}{\sqrt{2n}} \exp[i\pi m f(f+1)/n], & n \text{ odd}, \end{cases} \tag{27} \]

where $m$ is any positive integer coprime with $n$, and $\epsilon(f)$ is $1$ for $0 \le f \le n-1$ and $i$ for $f \ge n$. It can be verified that these $\{c(f)\}$ sets satisfy the requirements for the fast protocol. The two-qubit unitary constructed in this way is locally equivalent to (21) with $\alpha = \pi/4$, $\gamma = 0$, and $\beta$, which necessarily lies in the interval $[0, \pi/4]$, depending on $m$ and $n$ in a manner we have not studied in detail. It may be that the possible set of $\beta$ values is dense in $[0, \pi/4]$.
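The dihedral construction of Example 8 can be sanity-checked numerically. The sketch below assumes that the reflection matrices carry a minus sign on the lower-right cosine (so that they are orthogonal) and that $\epsilon(f) = i$ for $f \ge n$; both are readings of the formulas rather than independently confirmed facts, and the function names are hypothetical.

```python
import numpy as np

def dihedral_rep(n):
    """2x2 matrices for f = 0..2n-1: rotations for f < n, reflections for f >= n."""
    mats = []
    for f in range(2 * n):
        t = 2 * f * np.pi / n
        c, s = np.cos(t), np.sin(t)
        if f < n:
            mats.append(np.array([[c, -s], [s, c]]))
        else:
            mats.append(np.array([[c, -s], [-s, -c]]))  # assumed sign convention
    return mats

def dihedral_coefficients(n, m):
    """c(f) with quadratic phase f^2 (n even) or f(f+1) (n odd); eps = 1 or i."""
    f = np.arange(2 * n)
    quad = f * f if n % 2 == 0 else f * (f + 1)
    eps = np.where(f < n, 1, 1j)
    return eps * np.exp(1j * np.pi * m * quad / n) / np.sqrt(2 * n)

def example8_unitary(n, m):
    c = dihedral_coefficients(n, m)
    return sum(cf * np.kron(g, g) for cf, g in zip(c, dihedral_rep(n)))

def is_closed(mats, tol=1e-9):
    """Every product of two matrices in the set is again in the set."""
    return all(any(np.allclose(a @ b, g, atol=tol) for g in mats)
               for a in mats for b in mats)

def is_unitary(u, tol=1e-10):
    return np.allclose(u.conj().T @ u, np.eye(u.shape[0]), atol=tol)

for n, m in [(3, 1), (4, 1), (4, 3), (5, 2), (6, 5)]:
    print(n, m, is_closed(dihedral_rep(n)), is_unitary(example8_unitary(n, m)))
```

The pairs $(n, m)$ in the loop are arbitrary choices with $m$ coprime to $n$; they only illustrate that the resulting two-qubit operator is unitary, not that conditions (ii) and (iii) of the fast protocol hold.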



D. Relationship to controlled-Abelian-group unitaries

The following theorem, proved in Appendix C, shows that the family of unitaries for which fast protocols were constructed in Sec. II can also be realized using our fast protocol for double-group unitaries. The converse is not true, since, for instance, the 2-qubit SWAP gate in Example 5 cannot be realized as a controlled-Abelian-group unitary, as it is of Schmidt rank 4, while a controlled unitary on 2 qubits cannot have Schmidt rank greater than 2.

Theorem 4. Let $U$ be a controlled-Abelian-group unitary of the form (1), where the $V_k$ are a subset of an ordinary representation of an Abelian group $G$ of order $N$. Then $U$ is equivalent under local unitaries to

\[ W = \sum_{f=0}^{N-1} c(f)\, Q(f) \otimes R(f), \tag{28} \]

where the $c(f)$ are complex coefficients, the $Q(f)$ are linear combinations of $P_k$'s, and $\{Q(f)\}$, $\{R(f)\}$, $\{Q(f) \otimes R(f)\}$ are all ordinary representations of the group $G$. In addition the $c(f)$ can be chosen to satisfy the requirements for the fast protocol as given in Sec. III B. Hence all controlled-Abelian-group unitaries of the form discussed in Sec. II can be implemented by our fast double-group unitary protocol, without using more entanglement.



IV. CONCLUSIONS 

Any nonlocal unitary can be carried out deterministically by means of local operations and classical communication provided an appropriate entangled resource is available. However, teleportation and various more efficient schemes typically require two rounds of classical communication, and hence the minimum total amount of time required to complete the protocol is twice the time required for one-way communication. In certain cases there are fast protocols in which the minimum total time is only half as long, and in this paper we have discussed two protocols for fast bipartite unitaries. The first is shown in Fig. 2; it carries out a controlled-Abelian-group unitary of the form (1), including cases in which only a subset of the collection $\{V_k\}$ that forms an Abelian group appears in the sum. The second, shown in Fig. 4, will carry out a double-group unitary provided the coefficients $c(f)$ satisfy appropriate conditions. We have shown in Sec. III D that unitaries which can be carried out by the first protocol can also be carried out by the second, though the converse is not true (e.g., Example 5). We have constructed some examples for both protocols.

Note, however, that we have not been able to answer the fundamental question as to precisely which unitaries can be carried out exactly using a fast protocol and a fixed entanglement resource specified in advance. We do not know the answer even for fast protocols of the two types considered in this paper. Finding examples for our double-group protocol is not at all trivial; see the discussion in Sec. III B. In Sec. II C we discussed cases in which subsets of a group can be used to carry out a fast controlled unitary protocol at the cost of greater entanglement. See in particular Example 3, where we showed that any unitary in a particular continuous family can be approximated arbitrarily closely by a unitary implementable by a deterministic fast protocol, provided one is willing to use up enough entanglement. This is similar in spirit to the results in [11] and [12]. Their protocols may need less or more entanglement than our protocols, depending on the form of the unitary. In certain situations (e.g., Example 5), our protocol uses the minimum possible entanglement. It would be nice if these issues could be clarified in terms of some basic principle(s) of quantum information theory.

Another question we have not been able to answer is whether unitaries of the more general form $\sum_f U(f) \otimes W(f)$, where the $U(f)$ form an ordinary or projective representation of a group, but $W(f)$ need not do so, can be carried out by means of a fast protocol. A slow protocol was found in our earlier work, and we have constructed a fast version for that protocol, but it seems to only work for those unitaries implementable by our fast double-group protocol of Sec. III. Again, this may reflect some fundamental principle of quantum information, but if so we have not been able to identify it.

Appendix A: General considerations in protocols

In this section, we consider implementation of $U$ on $\mathcal{H}_A \otimes \mathcal{H}_B$ using a different $U'$ and use these ideas below to show that consideration of controlled unitaries can be restricted to those with rank-1 projectors. A scheme is shown in Fig. 5, which is valid for both the fast and slow protocols. If the protocol for $U'$ is fast, then the whole protocol is fast.

The circuit in Fig. 5 can be used for the following two situations, to be discussed in detail in the two subsections below. The first situation, called "extension", is that we extend the space of $\mathcal{H}_A$ to $\mathcal{H}_{E'}$, and the unitary $U' : \mathcal{H}_{E'} \otimes \mathcal{H}_B \to \mathcal{H}_{E'} \otimes \mathcal{H}_B$ is an extension of $U$, where $U$ is any unitary on $\mathcal{H}_A \otimes \mathcal{H}_B$. The second situation, called "compression with extension," is only for general controlled unitaries $U$ of the form (1). The protocol replaces the higher-rank projectors on $\mathcal{H}_A$ in $U$ with rank-one projectors on $\mathcal{H}_{E'}$ in $U'$, while adding more projectors if needed. Applications are found in Sec. II. Note that while Fig. 5 shows an extension (and compression in the case of controlled unitaries) on the A side, one can just as well do this on the B side, or both sides.

The input state for the whole protocol is any state on AB together with some fixed state on the ancilla E denoted by $|0\rangle_E$. The map $S : \mathcal{H}_A \otimes \mathcal{H}_E \to \mathcal{H}_{A'} \otimes \mathcal{H}_{E'}$ is unitary, and $S^\dagger$ is its inverse. The unitary $S$ obviously has its input dimension equal to its output dimension: $d_A d_E = d_{A'} d_{E'}$, where $d_{E'}$ is determined by $U'$ (see below), $d_A$ may be unequal to $d_{A'}$, and $d_E$ may be unequal to $d_{E'}$.

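1. Extending the Hilbert space in protocols

Here we consider the first type of extension, where $U'$ is the direct sum of $U_0$ (the same as $U$ but on a different space) and another unitary $\mathcal{R}$. Dimension $d_{E'}$ is fixed by $U'$ and is greater than $d_A$. One can always choose $d_E$ to be equal to $d_{E'}$, but in general may choose $d_E$ to be less than $d_{E'}$. The action of the unitary $S$ on the actual input space is determined by the following equation:

\[ S(|k\rangle_A \otimes |0\rangle_E) = |0\rangle_{A'} \otimes |k\rangle_{E'}, \quad k = 0, 1, \dots, d_A - 1, \tag{A1} \]

where $\{|k\rangle_A\}$ is an orthonormal basis of $\mathcal{H}_A$. The requirements for $S$ in this equation can be extended to a full definition of a unitary. The effect of $S$ is to transfer Alice's input state into $\mathcal{H}_{E'}$. Define $\mathcal{H}_{\bar A}$ to be the span of $\{|k\rangle_{E'} : 0 \le k \le d_A - 1\}$. Then $\mathcal{H}_{E'} = \mathcal{H}_{\bar A} \oplus \mathcal{H}_R$, where $\mathcal{H}_R$ is a space orthogonal to $\mathcal{H}_{\bar A}$. Then the correct form of $U'$ should be $U' = U_0 \oplus \mathcal{R}$, where $U_0 : \mathcal{H}_{\bar A} \otimes \mathcal{H}_B \to \mathcal{H}_{\bar A} \otimes \mathcal{H}_B$ is the same as the original unitary $U$ except it is on a different space, and $\mathcal{R} : \mathcal{H}_R \otimes \mathcal{H}_B \to \mathcal{H}_R \otimes \mathcal{H}_B$ is an arbitrary unitary.

FIG. 5. A scheme for implementing $U$ using a protocol for $U'$.

Now we prove that the circuit defined in Fig. 5 applied to $|0\rangle_E \otimes |\psi\rangle_{AB}$ yields $|0\rangle_E \otimes U|\psi\rangle_{AB}$.

Proof. Suppose the input state on $\mathcal{H}_{AB}$ is $|\psi\rangle_{AB} = |k\rangle_A |q\rangle_B$. Then

\[ \begin{aligned} |k\rangle_A |0\rangle_E |q\rangle_B &\xrightarrow{\;S\;} |0\rangle_{A'} |k\rangle_{E'} |q\rangle_B \\ &\xrightarrow{\;U'\;} |0\rangle_{A'} \otimes U'(|k\rangle_{E'} |q\rangle_B) \\ &= |0\rangle_{A'} \otimes U_0(|k\rangle_{E'} |q\rangle_B) \\ &= |0\rangle_{A'} \otimes \sum_{j=0}^{d_A-1} \sum_{p=0}^{d_B-1} \langle jp|U|kq\rangle\, |j\rangle_{E'} |p\rangle_B \\ &\xrightarrow{\;S^\dagger\;} |0\rangle_E \otimes U(|k\rangle_A |q\rangle_B). \end{aligned} \tag{A2} \]

The argument can be extended by linearity to superpositions of the states $|k\rangle_A |q\rangle_B$. □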
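The extension argument can be illustrated with a small numerical sketch in the isometry form $V = S|0\rangle_E$. The dimensions, the random unitaries, and the helper names below are arbitrary choices for illustration and are not taken from the paper; the tensor ordering $A' \otimes E' \otimes B$ is likewise an assumption of the sketch.

```python
import numpy as np

def random_unitary(d, rng):
    """A d x d unitary from the QR decomposition of a complex Gaussian matrix."""
    q, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))
    return q

def extension_isometry(d_A, d_Ap, d_Ep):
    """V: |k>_A -> |0>_{A'} |k>_{E'}, as a (d_A' * d_E') x d_A matrix."""
    V = np.zeros((d_Ap * d_Ep, d_A), dtype=complex)
    for k in range(d_A):
        V[0 * d_Ep + k, k] = 1.0
    return V

def direct_sum(a, b):
    out = np.zeros((a.shape[0] + b.shape[0],) * 2, dtype=complex)
    out[:a.shape[0], :a.shape[0]] = a
    out[a.shape[0]:, a.shape[0]:] = b
    return out

rng = np.random.default_rng(0)
d_A, d_B, d_Ap, d_Ep = 2, 3, 2, 4

U = random_unitary(d_A * d_B, rng)            # target unitary on H_A (x) H_B
R = random_unitary((d_Ep - d_A) * d_B, rng)   # arbitrary unitary on H_R (x) H_B
U_prime = direct_sum(U, R)                    # U' = U_0 (+) R on H_E' (x) H_B

V = extension_isometry(d_A, d_Ap, d_Ep)
VB = np.kron(V, np.eye(d_B))                  # V (x) I_B
overall = VB.conj().T @ np.kron(np.eye(d_Ap), U_prime) @ VB

print(np.allclose(overall, U))
```

The direct sum places $U_0$ on the first $d_A d_B$ basis states of $\mathcal{H}_{E'} \otimes \mathcal{H}_B$, which matches the span of $\{|k\rangle_{E'} : k < d_A\}$ under the index convention $e\, d_B + b$.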

As a side remark, the derivations above should still work if we replace the unitary $S$ by an isometry $V = S|0\rangle_E$, replace $S^\dagger$ by $V^\dagger$, and remove system E from the circuit in Fig. 5. It can be verified that the overall operation of the circuit is $(V^\dagger \otimes I_B)(I_{A'} \otimes U')(V \otimes I_B) = U$. We chose to present the argument using the unitary $S$ rather than the isometry $V$ in order to show that the scheme has no trouble finding an experimental implementation. The same remark also applies to the next subsection.

The current extension technique was useful in finding the protocols in Sec. II C and Example 8 in Sec. III C, but it turns out that those protocols (for the particular types of unitaries) can be simplified such that no extension is needed, which is why we have not explicitly mentioned this idea of extension in those sections.



2. Controlled unitaries: Conversion of higher rank projectors into rank-one projectors

For general controlled unitaries $U$ of the form (1) (not limited to those implementable by the fast protocols in this paper), we now consider a procedure that compresses the higher-rank projectors on $\mathcal{H}_A$ into rank-one projectors on $\mathcal{H}_{E'}$, while adding more projectors if needed. The form of $U'$ is

\[ U' = \sum_{k=0}^{N'-1} |k\rangle\langle k|_{E'} \otimes V_k, \tag{A3} \]

where $N' \ge N$. Apparently $d_{E'} = N'$.

The steps of the protocol are similar to those in Appendix A 1, but with the following change to the requirements on $S$ (and accordingly $S^\dagger$):

\[ S(|k,r\rangle_A \otimes |0\rangle_E) = |r\rangle_{A'} \otimes |k\rangle_{E'}, \tag{A4} \]

where $(k,r)$ is the label for the states in a specific basis of $\mathcal{H}_A$, with $k$ ($0 \le k \le N-1$) labeling which projector $P_k$, and $r \in \{0, 1, \dots, \mathrm{rank}(P_k) - 1\}$ labeling which basis state in the support of $P_k$. Note the range of $r$ depends on $k$, and because of this, $d_{A'}$ should be at least the maximum rank among the $P_k$'s ($0 \le k \le N-1$), while satisfying $d_A d_E = d_{A'} d_{E'}$. The requirements for $S$ in (A4) can be extended to a full definition of a unitary. The effect of $S$ is to transfer the information about "which $k$" into $\mathcal{H}_{E'}$, and that information is used in the controlled unitary $U'$, and then transferred back to $\mathcal{H}_A$ by $S^\dagger$.

The final state of the protocol is $|0\rangle_E \otimes U|\psi\rangle_{AB}$, and the proof for the correctness of the protocol is similar to that in Appendix A 1.

Proof. Suppose the input state on $\mathcal{H}_{AB}$ is $|\psi\rangle_{AB} = |k,r\rangle_A |q\rangle_B$. Then

\[ \begin{aligned} |k,r\rangle_A |0\rangle_E |q\rangle_B &\xrightarrow{\;S\;} |r\rangle_{A'} |k\rangle_{E'} |q\rangle_B \\ &\xrightarrow{\;U'\;} |r\rangle_{A'} \otimes U'(|k\rangle_{E'} |q\rangle_B) \\ &= |r\rangle_{A'} \otimes |k\rangle_{E'} \otimes V_k |q\rangle_B \\ &\xrightarrow{\;S^\dagger\;} |0\rangle_E \otimes |k,r\rangle_A \otimes V_k |q\rangle_B \\ &= |0\rangle_E \otimes U(|k,r\rangle_A |q\rangle_B). \end{aligned} \tag{A5} \]

The argument can be extended by linearity to superpositions of the states $|k,r\rangle_A |q\rangle_B$. □

As noted above, for the current subsection it is also possible to replace the unitary $S$ by an isometry $V = S|0\rangle_E = \sum_k \sum_r |r\rangle_{A'} |k\rangle_{E'} \langle k,r|_A$, replace $S^\dagger$ by $V^\dagger$, and remove system E from the circuit in Fig. 5. Using the definition of $U'$ in (A3), it can be verified that the overall operation of the circuit is $(V^\dagger \otimes I_B)(I_{A'} \otimes U')(V \otimes I_B) = U$, where $U$ is of the form (1).
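The compression step can be checked in the same spirit. The sketch below assumes $N' = N$, projectors $P_k$ that are diagonal in a basis ordered by $(k, r)$, and randomly chosen $V_k$; these choices and the helper names are hypothetical and serve only to illustrate the identity $(V^\dagger \otimes I_B)(I_{A'} \otimes U')(V \otimes I_B) = U$.

```python
import numpy as np

def random_unitary(d, rng):
    q, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))
    return q

def compression_isometry(ranks, d_Ap):
    """V = sum_{k,r} |r>_{A'} |k>_{E'} <k,r|_A for projectors of the given ranks."""
    N, d_A = len(ranks), sum(ranks)
    V = np.zeros((d_Ap * N, d_A), dtype=complex)
    col = 0
    for k, rank in enumerate(ranks):
        for r in range(rank):
            V[r * N + k, col] = 1.0   # row index of |r>_{A'} |k>_{E'}
            col += 1
    return V

rng = np.random.default_rng(1)
ranks, d_B = [2, 1, 3], 2                      # rank(P_0), rank(P_1), rank(P_2)
N, d_A, d_Ap = len(ranks), sum(ranks), max(ranks)
Vk = [random_unitary(d_B, rng) for _ in ranks]

# U = sum_k P_k (x) V_k with higher-rank projectors P_k on H_A.
U = np.zeros((d_A * d_B, d_A * d_B), dtype=complex)
start = 0
for k, rank in enumerate(ranks):
    P = np.zeros((d_A, d_A))
    P[start:start + rank, start:start + rank] = np.eye(rank)
    U += np.kron(P, Vk[k])
    start += rank

# U' = sum_k |k><k|_{E'} (x) V_k with rank-one projectors on H_E'.
U_prime = sum(np.kron(np.diag(np.eye(N)[k]), Vk[k]) for k in range(N))

V = compression_isometry(ranks, d_Ap)
VB = np.kron(V, np.eye(d_B))
overall = VB.conj().T @ np.kron(np.eye(d_Ap), U_prime) @ VB

print(np.allclose(overall, U))
```

Here $d_{A'}$ is set to the maximum rank, the smallest value allowed by the requirement stated above.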

Appendix B: Proofs of Theorems 1, 2, 3

Proof of Theorem 1

(a) We need to show [see HH) and note that both D 
and Zi are diagonal matrices] that 

\[ P_l = C Z_l C^\dagger = (1/N)\, M \kappa D Z_l D^\dagger \kappa^\dagger M^\dagger = (1/N)\, M \kappa Z_l \kappa^\dagger M^\dagger \tag{B1} \]

is a complex permutation matrix. The diagonal elements of $Z_l$ are complex conjugates of the elements in a row of $T$ and thus [Eq. (15)] equal to a common phase factor times those in a particular row, say row $m$, of the character table $\kappa$; recall that the complex conjugate of a row in $\kappa$ is always another row of $\kappa$. Now the matrix $\kappa Z_l$ is the matrix $\kappa$ with each column multiplied by the corresponding diagonal element of $Z_l$. Thus the $j$-th row of $\kappa Z_l$ is the element-wise product of row $j$ of $\kappa$ with row $m$ of $\kappa$, up to an overall phase that depends on $j$. But since the rows of $\kappa$ form a group under element-wise products, this means that $\kappa Z_l = Q_l \kappa$ for a suitable complex permutation matrix $Q_l$. Since $\kappa \kappa^\dagger = N I$, (B1) tells us that $P_l = M Q_l M^\dagger$, and because $M$, $Q_l$, and $M^\dagger$ are all complex permutation matrices, so is $P_l$.
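The key step of part (a), that $\kappa Z_l = Q_l \kappa$ with $Q_l$ a complex permutation matrix, can be illustrated for the simplest case. The sketch below assumes a cyclic group, whose character table is the discrete Fourier matrix, and takes $T = \kappa$ with trivial phases; the function names are hypothetical.

```python
import numpy as np

def cyclic_character_table(N):
    """Character table of the cyclic group C_N: kappa[j, k] = exp(2*pi*i*j*k/N)."""
    k = np.arange(N)
    return np.exp(2j * np.pi * np.outer(k, k) / N)

def is_complex_permutation(M, tol=1e-9):
    """Exactly one entry of unit magnitude in every row and column, zeros elsewhere."""
    mag = np.abs(M)
    binary = np.isclose(mag, 0, atol=tol) | np.isclose(mag, 1, atol=tol)
    return bool(binary.all()
                and np.allclose(mag.sum(axis=0), 1)
                and np.allclose(mag.sum(axis=1), 1))

N = 6
kappa = cyclic_character_table(N)
results = []
for m in range(N):
    Z_l = np.diag(kappa[m].conj())            # diagonal = complex conjugate of row m
    Q_l = kappa @ Z_l @ kappa.conj().T / N    # uses kappa kappa^dagger = N I
    results.append(is_complex_permutation(Q_l)
                   and np.allclose(kappa @ Z_l, Q_l @ kappa))
print(results)
```

For this choice each $Q_l$ is a cyclic shift, reflecting the fact that multiplying one row of the character table element-wise by another row yields a third row.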

(b) The rows of $T$ are the elements of a group $H$ in the following sense. The element-wise product of any two rows is, up to a phase, a third row, and the fact that the rows are linearly independent means that this third row is uniquely determined. That is, there is a well-defined associative group multiplication, which is commutative, so the group $H$ is Abelian. There is necessarily one row consisting of identical elements; this is the identity element of $H$. Given any row, there is another row which is, element by element, its complex conjugate, up to a single phase for the whole row; these two rows are inverses of each other. Hence the group $H$ is well defined. Obviously, each column of $T$ consists of elements (viewed as $1 \times 1$ matrices) that form an irreducible representation of $H$ under the multiplication of complex numbers, and all the elements of $T$ are of magnitude 1.

Divide each row by its first element to form the matrix $T'$, whose rows again form the group $H$, but now without additional phases since the first element of each row is 1. Since the rows of $T$ are linearly independent, so are the rows of $T'$, and hence also its columns. Thus each column of $T'$ forms a distinct irreducible representation of the group $H$. All the $N$ distinct irreducible representations of $H$ are included in the columns of $T'$, thus $T'$ is the transpose of a character table of the Abelian group $H$, and such a transpose is itself a character table of $H$ (see the discussion preceding Theorem 1). Hence we can identify $T'$ as the $\kappa$ in (15), and the phases used to change $T$ to $T'$ can be included in the matrix $L$ in (15). □
Proof of Theorem 2

Let $\bar C = \sqrt{N}\, C$ and rewrite the condition $P_l = C Z_l C^\dagger$ in the equivalent form

\[ \bar C Z_l = P_l \bar C. \tag{B2} \]

The matrix on the left is obtained from $\bar C$ by multiplying each column by the corresponding diagonal element of $Z_l$, while the one on the right is obtained by some permutation of the rows of $\bar C$, with an additional overall phase for each row. Consider the special case in which the first row of $\bar C$ consists of 1's. Then the first row of $\bar C Z_l$ is $z_l$, the row vector whose elements are the diagonal elements of $Z_l$, and according to (B2), it is equal to the first row of $P_l \bar C$ (i.e., a phase times some other row of $\bar C$). As the $z_l$ are linearly independent (they are complex conjugates of the rows of the Hadamard matrix $T$), it follows by equating, for each $l$, the first row on the left side of (B2) with that on the right side, that the element-wise product of the first row of $\bar C$ and the $z_l$ for all possible values of $l$ generates all the rows of $\bar C$ up to a phase (i.e., each row of $\bar C$ is a phase times one of the $z_l$). Then according to (B2), the element-wise product of any row of $\bar C$ with any $z_l$ (i.e., the element-wise product of any two rows of $\bar C$ up to a phase) is always a third row of $\bar C$ up to a phase. Similarly the product of any two of the matrices in the set $\{Z_l\}$ is a third, up to a phase; this is an associative group product. The first row of $\bar C$ is equal to one of the $z_l$ up to a phase, and this $Z_l$ corresponds to the group identity. Because the first row of $\bar C$ consists of 1's, there is a row of $P_l \bar C$, say row $m$, which consists of equal elements. Then the $m$-th row of $\bar C$ [see (B2)] identifies the group inverse of $Z_l$. Thus the $Z_l$ under matrix products form a group (denoted by $H$) up to phases. For any $Z_l$, its complex conjugate $Z_l^*$ is the matrix inverse of $Z_l$, and hence some $Z_{l'}$ up to a phase. Consequently all the different rows of $T$, the complex conjugates of $z_l$, are equal to the different $z_l$ up to phases and a permutation of the ordering, hence these rows form the group $H$ up to phases under element-wise multiplication. Then according to Theorem 1(b), $H$ must be Abelian, and $T$ is of the form given in (15). Since the rows of $T$ are a permutation of the different $z_l$ up to phases, they are also a permutation of the rows of $\bar C$ up to phases. Thus
C is of the form (jlSp with D ^ I. Hence for the special 
case under consideration, ([T5|) is satisfied, and then ([THj) 
follows from 

Next consider the more general case in which the elements of the first row of $\bar C$ are all nonzero. Form $\bar C'$ from $\bar C$ by dividing every column of $\bar C$ by the corresponding element on the first row. This means that $\bar C' = \bar C Q$, where $Q$ is a diagonal matrix. Since $Q$ commutes with every $Z_l$, we can replace $\bar C$ on both sides of (B2) with $\bar C'$. As the first row of $\bar C'$ consists of 1's, the argument given above shows that the $\{Z_l\}$ form an Abelian group up to phases, and the rows of $\bar C'$ are, up to phases, some permutation of the rows of $T$, and $T$ is of the form given in (15), again according to Theorem 1(b). Thus the different columns of $\bar C'$ all have the same normalization, and since the same is true of $\bar C$, as $C$ is assumed to be a unitary matrix, it follows that the diagonal elements
of Q all have magnitude 1, and one can set D = in 
(fT5l) . Then (HH) follows from 1^. □ 

Proof of Theorem 3

The coefficients $c(f)$ must all be of the same magnitude, $1/\sqrt{N}$ (see the remark following Theorem 1), and without loss of generality $\gamma$ can be chosen such that $c(e) = 1/\sqrt{N}$ in (20), where $e$ is the group identity. First consider the case that $\{\Gamma(f)\}$ is an ordinary representation of $G$, so that $\lambda(g,h) = 1$. Choose any group element $r \ne e$ and assume it has order $p$, which means that $p$ divides $N$ and $r^p = e$. The first row of $C$, corresponding to $g = e$ in (13), contains $c(e), c(r), c(r^2), \dots, c(r^{p-1})$ in some order, interspersed with other coefficients $c(f)$. Now consider the row of $C$ corresponding to $g = r^{-1} = r^{p-1}$ in (13). It is related to the first row as indicated here,

\[ \begin{matrix} c(e) & c(r) & c(r^2) & \cdots & c(r^{p-1}) \\ c(r) & c(r^2) & \cdots & c(r^{p-1}) & c(e) \end{matrix} \tag{B3} \]

where only the relevant columns are shown, rearranged in a convenient order.

Let us define the $C'$ matrix, as in the first paragraph of Sec. III B, to be the one obtained from $\bar C = \sqrt{N}\, C$ by multiplying each row and each column by some phase, so that all the elements in the first row and first column are equal to 1. This is a character table, so every element is an $N$-th root of 1. Equivalently, $C'$ is obtained from $\bar C$ by dividing each column of $\bar C$ by the corresponding element in the first row, and then in the resulting matrix dividing each row by its first element. Consequently, applying this process to the rows and columns shown in (B3), we conclude that

\[ \frac{c(r^2)}{c(r)^2} = \phi_1 \sqrt{N}, \quad \frac{c(r^3)}{c(r)\,c(r^2)} = \phi_2 \sqrt{N}, \quad \dots, \quad \frac{c(e)}{c(r)\,c(r^{p-1})} = \phi_{p-1} \sqrt{N}, \tag{B4} \]

where each $\phi_j$ is an $N$-th root of 1. The product of these $p-1$ equations,

\[ c(e)/[c(r)]^p = \phi_1 \phi_2 \cdots \phi_{p-1} (\sqrt{N})^{p-1}, \tag{B5} \]

implies, since $c(e) = 1/\sqrt{N}$, that $\sqrt{N} c(r)$ must be a $p$-th root of a number which is itself an $N$-th root of 1, and because $p$ divides $N$, $c(r)$ is of the form (20). This completes the argument for an ordinary representation.

When the $\{\Gamma(f)\}$ form a projective representation of $G$ with a standard factor system (11), the first row in (B3) is the same, but the second row will be multiplied by appropriate factors $\lambda(g,h)$. Since we are assuming a normalized factor system, all these additional factors are themselves $N$-th roots of 1, so (B4) still holds for $\phi_j$ which are $N$-th roots of 1, and the rest of the argument is the same as before. □



Appendix C: Proof of Theorem 4

In this appendix, we prove Theorem 4, which says that the controlled unitary $U = \sum_{k=0}^{N-1} P_k^A \otimes V_k^B$ given by (1), where $P_k^A$ are orthogonal projectors, and $\{V_k^B\}$ form a subset of an ordinary representation of an Abelian group $G$, is equivalent to

\[ W = \sum_{f=0}^{N-1} c(f)\, Q(f) \otimes R(f) \tag{C1} \]

under local unitaries, where $c(f)$ are complex coefficients [will be defined in (C3)], $Q(f)$ are linear combinations of $P_k^A$, and $\{Q(f) \otimes R(f)\}$ is an ordinary representation of $G$. In addition, the $c(f)$ can be chosen to satisfy the requirements for the fast protocol, hence all controlled-Abelian-group unitaries can be implemented by the fast double-group unitary protocol.

We first prove the case that $\{V_k^B\}$ form a whole representation, not a subset, and at the end we will remark that the proof also works in the "subset" case. The proof is by explicitly constructing a $W$ and showing that it is equivalent to $U$ under local unitaries. Any Abelian group is a direct sum of cyclic groups, so $G = C_{r_1} \oplus C_{r_2} \oplus \cdots \oplus C_{r_\eta}$, where $\eta \ge 1$, and $r_s$ is the order of the cyclic group $C_{r_s}$. Then $N = |G| = \prod_{s=1}^{\eta} r_s$. The group element $k$ is relabeled by a vector $k = (k_1, k_2, \dots, k_\eta)$, where $0 \le k_s \le r_s - 1$. We use the convention that $k$ is the sequential numbering (starting from 0) for the vectors in lexicographical order, so that $k = 0$ corresponds to $(0, 0, \dots, 0)$, and $k = 1$ corresponds to $(0, 0, \dots, 1)$, etc. Suppose $\{V_k^B\}$ has been diagonalized under a suitable unitary similarity transform, then $\{V_k^B\}$ is the direct sum of some irreducible representations (possibly with redundancy). All possible irreducible representations of $G$ are one-dimensional, and have the form

\[ k \mapsto \prod_{s=1}^{\eta} \exp(2\pi i k_s q_s / r_s), \tag{C2} \]

where $q = (q_1, q_2, \dots, q_\eta)$ is the label for irreducible representations (some may be missing from $\{V_k^B\}$, but we still include them in this labeling scheme for convenience). Denote the computational basis of $\mathcal{H}_B$ by $|b\rangle$, $b = 0, 1, \dots, d_B - 1$, then the $b$-th diagonal elements in $V_k^B$ determine an irreducible representation labeled by $q_b$, $0 \le q_b \le N - 1$. The $q_b$ can be written in the vector form [see (C2) and the sentence after that], and the components in the vector $q_b$ will be denoted by $q_{b,s}$.

As discussed above, for every $f \in G$ we can represent $f$ using a set of integers: $f = (f_1, f_2, \dots, f_\eta)$, with group multiplication corresponding to vector addition (modulo $r_s$ for the $s$-th element of the vector). Define $c(f) = \prod_{s=1}^{\eta} c_s(f_s)$, where $c_s(f_s)$ is defined by (basically the same as in Example 6)

\[ c_s(f_s) = \frac{1}{\sqrt{r_s}} \exp[-\pi i f_s (r_s \bmod 2 + f_s)/r_s], \quad f_s = 0, 1, \dots, r_s - 1. \tag{C3} \]

We choose the $Q(f)$ to be

\[ Q(f) = \sum_{k=0}^{N-1} \left[ \prod_{s=1}^{\eta} \exp(-2\pi i f_s k_s / r_s) \right] P_k^A, \tag{C4} \]

where $(k_s)$ is the vector labeling for $k$. Define $R(f)$ as

\[ R(f) = \sum_{b=0}^{d_B-1} \left[ \prod_{s=1}^{\eta} \exp(-2\pi i f_s q_{b,s} / r_s) \right] |b\rangle\langle b|, \tag{C5} \]

where $q_{b,s}$ are the components in the vector labeling for $q_b$. It is not hard to verify that $\{Q(f) \otimes R(f)\}$ is an ordinary representation of $G$, and the coefficients $c(f)$ form a unitary $C$ matrix of the type (13), hence $W$ is unitary.

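With the above choices of $Q(f)$ and $R(f)$,

\[ W|k\rangle|b\rangle = \frac{1}{\sqrt{N}} \sum_{f=0}^{N-1} \left( \prod_{s=1}^{\eta} \exp[-\pi i f_s (r_s \bmod 2 + f_s + 2k_s + 2q_{b,s})/r_s] \right) |k\rangle|b\rangle, \tag{C6} \]

where $0 \le k \le N-1$, $0 \le b \le d_B - 1$, and $|k\rangle$ is any eigenstate of $P_k^A$. Denote the phase factor in front of $|k\rangle|b\rangle$ in the above equation by $\zeta_{kb}$, then

\[ \begin{aligned} \zeta_{kb} &= \frac{1}{\sqrt{N}} \prod_{s=1}^{\eta} \sum_{j=0}^{r_s-1} \exp\!\left( \pi i \{ -[j + k_s + q_{b,s} + (r_s \bmod 2)/2]^2 + [k_s + q_{b,s} + (r_s \bmod 2)/2]^2 \}/r_s \right) \\ &= \frac{1}{\sqrt{N}} \prod_{s=1}^{\eta} \sum_{j=0}^{r_s-1} \exp\{-\pi i [j + k_s + q_{b,s} + (r_s \bmod 2)/2]^2/r_s\} \exp\{\pi i [k_s + q_{b,s} + (r_s \bmod 2)/2]^2/r_s\} \\ &= \frac{a}{\sqrt{N}} \prod_{s=1}^{\eta} \exp\{\pi i [k_s + q_{b,s} + (r_s \bmod 2)/2]^2/r_s\}, \end{aligned} \tag{C7} \]

where $a$ is a constant independent of $k$ and $q$. In deriving the last line above, we have used $(r+j)^2 \equiv j^2 \pmod{2r}$ for even $r$, and $(r + j + 1/2)^2 \equiv (j + 1/2)^2 \pmod{2r}$ for odd $r$, which make the substitution $j + k_s + q_{b,s} \to j$ possible, and obtained $a = \prod_{s=1}^{\eta} a_s$, where $a_s = \sum_{j=0}^{r_s-1} \exp\{-\pi i [j + (r_s \bmod 2)/2]^2/r_s\}$.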
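The shift invariance used in the last step of (C7) is easy to confirm numerically. The sketch below evaluates the one-dimensional sum for several integer shifts and also checks that its magnitude is $\sqrt{r}$, which is what makes the resulting phase factors have unit modulus; the function name `shifted_gauss_sum` is a hypothetical label.

```python
import numpy as np

def shifted_gauss_sum(r, shift):
    """sum_{j=0}^{r-1} exp(-i*pi*(j + shift + (r mod 2)/2)^2 / r) for an integer shift."""
    j = np.arange(r)
    half = (r % 2) / 2
    return np.sum(np.exp(-1j * np.pi * (j + shift + half) ** 2 / r))

for r in range(1, 9):
    base = shifted_gauss_sum(r, 0)
    same = all(np.isclose(shifted_gauss_sum(r, s), base) for s in range(-3, 8))
    print(r, same, np.isclose(abs(base), np.sqrt(r)))
```

The half-integer offset for odd $r$ is what restores periodicity in $j$ with period $r$; without it the summand would only be periodic with period $2r$.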

Define the local operators $M_A$ and $M_B$ on $\mathcal{H}_A$ and $\mathcal{H}_B$, respectively, as follows:

\[ M_A = \sum_{k=0}^{N-1} \zeta_{k0}^{*}\, P_k^A, \qquad M_B = \zeta_{00} \sum_{b=0}^{d_B-1} \zeta_{0b}^{*}\, |b\rangle\langle b|. \tag{C8} \]

From the unitarity of $W$, $\zeta_{kb}$ is always a phase factor with magnitude 1, hence $M_A$ and $M_B$ are unitary operators. Then for $|k\rangle$ chosen arbitrarily from the eigenstates of $P_k^A$, we have

\[ \begin{aligned} (M_A \otimes M_B) W |k\rangle|b\rangle &= (M_A \otimes M_B)\, \zeta_{kb} |k\rangle|b\rangle \\ &= \zeta_{00}\, \zeta_{kb}\, \zeta_{k0}^{*}\, \zeta_{0b}^{*}\, |k\rangle|b\rangle \\ &= \left[ \prod_{s=1}^{\eta} \exp(2\pi i k_s q_{b,s}/r_s) \right] |k\rangle|b\rangle \\ &= U |k\rangle|b\rangle, \end{aligned} \tag{C9} \]

where we have used (C7) to derive the third line, and used (C2) to derive the fourth line. Since $P_k^A$ are of finite rank, there exists a finite collection of states of the form $|k\rangle|b\rangle$ to make a complete basis of $\mathcal{H}_{AB}$. The actions of $(M_A \otimes M_B)W$ and $U$ are the same on all states in a complete basis, hence they must be identical operators. Therefore $U$ is equivalent to $W$ under local unitaries. Using the algorithm in Sec. III B, it can be verified that the choice of coefficients $c(f)$ given above (which can be viewed as the choice in Example 6 generalized to the non-cyclic Abelian groups) satisfies the requirements for the fast protocol. Hence the double-group protocol for $W$ is fast.

The proof above can basically be applied to the case that $\{V_k^B\}$ form a subset of a representation. In general some $P_k^A$ do not occur in the expressions for $Q(f)$ and $M_A$; those can be safely removed because $Q(f)$ and $M_A$ are block diagonal, where the blocks are determined from the support of the $P_k^A$'s. The coefficients $c(f)$ are still the same as above, so the protocol is still fast. Hence the proof still works. □